Computationally Intensive and Noisy Tasks: Co-Evolutionary Learning and Temporal Difference Learning on Backgammon

Author

  • Paul J. Darwen

Abstract

The most difficult but realistic learning tasks are both noisy and computationally intensive. This paper investigates how, for a given solution representation, co-evolutionary learning can achieve the highest ability from the least computation time. Using a population of Backgammon strategies, this paper examines ways to make computational costs reasonable. With the same simple architecture Gerald Tesauro used for Temporal Difference learning to create the Backgammon strategy “Pubeval”, co-evolutionary learning here creates a better player.

Related resources

Why Co-Evolution beats Temporal Difference learning at Backgammon for a linear architecture, but not a non-linear architecture

The No Free Lunch theorems show that the algorithm must suit the problem. This does not answer the novice’s question: for a given problem, which algorithm to use? This paper compares co-evolutionary learning and temporal difference learning on the game of Backgammon, which (like many real-world tasks) has an element of random uncertainty. Unfortunately, to fully evaluate a single strategy using...

Co-Evolutionary Learning on Noisy Tasks

This paper studies the effect of noise on coevolutionary learning, using Backgammon as a typical noisy task. It might seem that co-evolutionary learning would be ill-suited to noisy tasks: genetic drift causes convergence to a population of similar individuals, and on noisy tasks it would seem to require many samples (i.e., many evaluations, and long computation time) to discern small differenc...

Coevolution of a Backgammon Player

One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims’s work on artificial robots, most of the work has attacked very simple games of prisoners dilemma or predator and prey. Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to devel...

Why did TD-Gammon Work?

Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon, developing a competitive evaluation function on a 4000 parameter feed-forward neural network, without using back-propagation, reinforcement ...

Improving Temporal Difference Learning Performance in Backgammon Variants

Palamedes is an ongoing project for building expert playing bots that can play backgammon variants. As in all successful modern backgammon programs, it is based on neural networks trained using temporal difference learning. This paper improves upon the training method that we used in our previous approach for the two backgammon variants popular in Greece and neighboring countries, Plakoto and F...

Publication date: 2000